A SCALING ANALYSIS OF A TRANSIENT STOCHASTIC NETWORK (I)

MATHIEU FEUILLET AND PHILIPPE ROBERT



Abstract. In this paper, a simple transient Markov process with an absorbing
point is used to investigate the qualitative behavior of a large scale storage
network of non-reliable file servers where files can be duplicated. When the
size of the system goes to infinity, it is shown that there is a critical value for the
maximum number of files per server such that below this quantity, the system
stays away from the absorbing state, all files lost, in a quasi-stationary state
where most files have a maximum number of copies. Above this value, the
network loses a significant number of files until some equilibrium is reached.
When the network is stable, it is shown that, with convenient time scales, the
evolution of the network towards the absorbing state can be described via a
stochastic averaging principle.

1. Introduction

Storage systems. One considers a large scale storage system, i.e. a set of file
servers in a communication network. In order to ensure persistence, files are
duplicated on several servers. When the disk of a given server breaks down, its files
are lost but can be retrieved on the other servers if copies are available. For these
architectures a fraction of the bandwidth of a server is devoted to the duplication
mechanism of its files to other servers. On the one hand, there should be sufficiently
many copies so that any file has a copy available on at least one server at any time.
On the other hand, in order to use the bandwidth in an optimal way, there should
not be too many copies of a given file so that the network can accommodate a
large number of distinct files. These systems are known as distributed hash tables
(DHTs); they play an important role in the development of some large scale
distributed systems, see Rhea et al. [28] and Rowstron and Druschel [30] for a more
detailed presentation.

Failures of disks occur naturally at random. These events are quite rare but, given
the large number of nodes of these distributed systems, this is not a negligible
phenomenon at the level of the network. If, for a short period of time, several of
the servers break down, it may happen that files will be lost for good just because
all the available copies were on these servers and because the recovery procedure
was not completed before the last copy disappeared. To design such a system, it
is therefore desirable to find a convenient duplication policy and to dimension the
system so that all files will have at least one copy for as long as possible. The natural
critical parameters of the network are the failure rates of servers, the bandwidth
allocated to duplication, the number of files and the number of servers, the ratio
of the last two quantities being a measure of the storage capacity of the system. It
is important to understand the impact of each of these parameters on the efficiency
of the storage system.

Stochastic Models. This network can be seen as a classical set of queues with
breakdowns. Numerous stochastic models of such systems have been investigated
in the literature, see Chapter 6 of King [22] for example and the references therein.
Related models concern queues with retrial and queues with servers of walking
types, see Artalejo and Gomez-Corral [2] and Falin and Templeton [10]. For most
of the systems analyzed, there are, in general, one or two nodes which are subject
to breakdowns. A queueing analysis is generally done in this context: convergence
in distribution of the associated Markov model and analysis of the distribution of
the availability of the system, of the delays, of the queue sizes, etc. For DHTs, the
rare stochastic models used to investigate their performances describe the evolution of
the number of copies of a given file. See Chun et al. [5], Picconi et al. [25] and
Ramabhadran and Pasquale [26]. See also Feuillet and Robert [13]. In most of
these studies the interaction between different files, due to the bandwidth sharing
limitations, has not really been considered, except through simulations. The
purpose of this paper is to investigate the impact of this interaction. The second
important aspect is that a large system, i.e. with a large number of files, will be
considered instead of a small number of elements. This assumption is quite natural
for current distributed systems.

More precisely, the following simple model is considered. A file can have at most
two copies, and the total bandwidth allocated to file duplication is given by $\lambda N$, for
$\lambda > 0$ and $N \in \mathbb{N}$. If at some moment there are $x \ge 1$ files with exactly one copy, a
new copy of each of these files is created at rate $\lambda N/x$. It is assumed that initially
$F_N$ files are present in the system with two copies and each copy of a file disappears
at rate $\mu$. Recall that a file with $0$ copies is lost. It will be assumed that the total
number of files $F_N$ is proportional to $N$, i.e. that $F_N/N$ converges to some $\beta > 0$.
Clearly enough, this system is transient and the empty state, where all files are lost, is an
absorbing state. The aim of this paper is to describe the decay of the network,
i.e. how the set of lost files is increasing. For $\delta > 0$, there exists some finite random
instant $T_N(\delta)$ such that $\lfloor \delta N \rfloor$ files are lost after time $T_N(\delta)$. The
paper investigates the order of magnitude in $N$ of the variables $T_N(\delta)$ as $N$ gets
large and the role of the parameters $\lambda$, $\mu$ and $\beta$ in these asymptotics.

In practice, if there are $N$ servers and each of them has an available bandwidth
$\lambda$ to duplicate files, the maximal capacity for duplication is then $\lambda N$. The
model described above has therefore an optimal use of the duplication mechanism
since the maximal duplication capacity is always available. For this reason this
model provides upper bounds on the optimal performances of such a system. In
particular, for any duplication mechanism, after a duration of time with the same
distribution as $T_N(\delta)$, at least $\lfloor \delta N \rfloor$ files will be lost for good. A more realistic
model, when the total duplication bandwidth is no longer centralized, is investigated
in Feuillet and Robert [14] via mean-field limit asymptotics. It turns out that
the corresponding mean-field limit can in fact be expressed in terms of the simple
model analyzed in this paper. The more general case when there are at most $d > 2$
copies of a given file will be investigated in another sequel to this paper.

Time Scales of Transient Markov Processes. If, for $i \in \{0,1,2\}$, $X_i^N(t)$ denotes
the number of files with $i$ copies in the network, then, under Poisson assumptions
for failures and for duplication processes, $(X_0^N(t), X_1^N(t))$ is clearly a finite
Markov process with $(F_N, 0)$ as an absorbing state. Unlike the previous
works mentioned above, there is clearly no question of equilibrium here since the
system dies at $(F_N, 0)$. A possible approach to investigate the decay of such a
system could be to consider the associated quasi-stationary distributions of the
Markov process. See Darroch and Seneta [6] and Ferrari et al. [11] for example. It
would give a description of the system conditionally on the event that only a fraction
of the files has been lost. These quantities are generally expressed in terms of
the spectral characteristics of the jump matrix. For this reason, explicit descriptions
of these distributions are quite rare outside one-dimensional birth and death processes.
In this paper, different time scales will be used to investigate the qualitative
behavior of these transient processes. Time scales can be thought of as "lenses": two
of them will focus on the stable part of the sample path of the process (if any),
which will give at the same time a kind of associated quasi-stationary distribution.
Finally, a third time scale will focus on the decaying part of the sample paths, i.e.
when the proportion of lost files is steadily increasing.

Stochastic Averaging Principles. It is shown that in some cases, a stochastic
averaging principle (SAP) occurs for this transient process: roughly speaking its
dynamics can be decomposed into two components, one evolving on a fast time
scale and the other one on a slower time scale. The system is fully coupled in the
sense that the jump rates of the slow process depend on the equilibrium of the
fast process, and the jump rates of the fast process depend on the state of the slow
process. See Khasminskii [21] and Freidlin and Wentzell [15]. This phenomenon
is known to occur for the classical example of loss networks. In this case the
vector of the number of free places of the congested links is the fast component, see
Kelly [20] and Hunt and Kurtz. Outside this class of networks, there are, up
to now, few examples of stochastic networks for which a fully coupled SAP occurs.
See Feuillet [12] and Perry and Whitt [24] for recent examples of SAP.

This SAP phenomenon is already well known in the framework of deterministic
dynamical systems, see Guckenheimer and Holmes [16]. In a stochastic context,
an additional difficulty, sometimes underestimated, is that of controlling the regularity
properties of the family of invariant distributions indexed by the states of the slow
process, instead of the family of fixed points in the deterministic case. This can
be done through a kind of uniform control of some ergodic averages, see Freidlin
and Wentzell [15], or by using a martingale representation of the associated Markov
processes, see Kurtz [23]. In any case, there are several delicate technical issues to
address: a convenient tightness result for a set of random measures and the rate
of convergence of ergodic averages. In this paper, a martingale formulation is also
used but with a significantly reduced technical background. By taking a convenient
state space for random measures, technical results related to extensions of random
measures with specific measurability properties are not necessary. Furthermore,
the tightness of the family of invariant distributions of fast processes is obtained as
a consequence of a simple monotonicity property. Although the monotonicity property is
quite specific, it seems that the method to avoid extension results can be used in a
quite general framework. This will be the subject of further investigations.

Outline of the Paper. Section 2 introduces the Markov process investigated and
its corresponding martingale representation. Section 3 studies a fluid picture of the
network, i.e. the limit of the sequence of processes $(X_0^N(t)/N, X_1^N(t)/N)$; it is shown
in Theorem 1 that its limit, the solution of an ODE, is not trivial when $\lambda < 2\mu\beta$
and is $(0, 0)$ when $\lambda \ge 2\mu\beta$. The storage system is therefore properly designed when
$\lambda > 2\mu\beta$, otherwise it is inefficient since it is losing a significant number of files right
from the beginning. Section 4 is devoted to the critical case $\lambda = 2\mu\beta$: Theorem 2
shows that the sequence of processes $(X_0^N(t)/\sqrt{N}, X_1^N(t)/\sqrt{N})$ is converging in
distribution and that its limit can be expressed in terms of a non-Markovian
one-dimensional process, solution of an unusual stochastic differential equation with
reflection at 0. In Section 5 the stable case $\lambda > 2\mu\beta$ is investigated. It is shown
that the capacity of the system remains intact at the normal time scale: for $t \ge 0$,
Theorem 3 proves that the variable $(X_0^N(t))$ converges in distribution to a Poisson
process. Only a finite number of files is lost as $N$ goes to infinity. More interestingly,
Theorem 4 shows that on the time scale $t \to Nt$ the transience of the Markov
process shows up: at "time" $Nt$ a fraction $\Psi(t)N$ of the files is lost, where $\Psi(t)$
is the solution of some fixed point equation. This is the case where a stochastic
averaging principle holds: around time $Nt$ there is a local equilibrium for which
$(\beta - \Psi(t))N$ files are still available. As a consequence, $t \to Nt$ is the convenient
time scale to observe the degradation of the storage system. The proofs of the
convergence results use a more or less straightforward extension of the classical
Skorokhod problem formulation, see Skorokhod [32]. The necessary material is
gathered in the appendix to keep the paper self-contained.

2. The Stochastic Model 

Recall that $F_N$ is the total number of distinct files initially present in the network
and $X_1^N(t)$, resp. $X_0^N(t)$, is the number of files with one copy at time $t$, resp. the number
of lost files at this instant. The number $X_2^N(t)$ of files with two copies at time $t$ is
defined by $X_2^N(t) = F_N - X_0^N(t) - X_1^N(t)$. In general it will be assumed that all files
have the maximum number of copies initially. The copy of a file is lost with rate $\mu$
and, conditionally on $X_1^N(t) = x$, a file with only one copy gets an additional copy
with rate $\lambda N/x$. All events are supposed to occur after an exponentially distributed
amount of time. Under these assumptions $(X^N(t)) := (X_0^N(t), X_1^N(t))$ is a Markov
process on the state space

$$ S = \{x = (x_0, x_1) \in \mathbb{N}^2 : x_0 + x_1 \le F_N\}, $$

as mentioned above, with these assumptions, the state $(F_N, 0)$ is an absorbing point
of the process $(X^N(t))$.

For $x \in \mathbb{N}^2$, the Q-matrix $Q^N = (q^N(\cdot, \cdot))$ of the process $(X^N(t))$ is defined by

$$
\begin{cases}
q^N(x, x + e_1) = 2\mu(F_N - x_0 - x_1),\\
q^N(x, x - e_1) = \lambda N \mathbf{1}_{\{x_1 > 0\}},\\
q^N(x, x - e_1 + e_0) = \mu x_1,
\end{cases}
\tag{1}
$$

with $e_0 = (1, 0)$ and $e_1 = (0, 1)$. It is assumed that

$$ \lim_{N \to +\infty} F_N/N = \beta, \tag{2} $$

and one denotes $\rho = \lambda/\mu$.
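
As an illustration, the jump rates of Relation (1) can be simulated directly with the standard jump-chain construction of a continuous-time Markov process. The Python sketch below is not part of the original analysis: the function name `simulate`, its arguments, and the choices $F_N = \lfloor \beta N \rfloor$ and "all files start with two copies" are assumptions made for the example.

```python
import random

def simulate(N, beta, lam, mu, t_max, seed=0):
    """Simulate (X_0, X_1) with the rates of Relation (1) up to time t_max.

    Returns (x0, x1, F): lost files, files with one copy, total number of files.
    """
    rng = random.Random(seed)
    F = int(beta * N)      # assumption: F_N = floor(beta * N)
    x0, x1 = 0, 0          # assumption: every file starts with two copies
    t = 0.0
    while True:
        rate_lost = mu * x1                      # x -> x - e1 + e0
        rate_dup = lam * N if x1 > 0 else 0.0    # x -> x - e1
        rate_copy = 2.0 * mu * (F - x0 - x1)     # x -> x + e1
        total = rate_lost + rate_dup + rate_copy
        if total == 0.0:
            break                                # absorbing state (F, 0)
        t += rng.expovariate(total)              # exponential holding time
        if t > t_max:
            break
        u = rng.random() * total                 # pick the next transition
        if u < rate_lost:
            x0, x1 = x0 + 1, x1 - 1
        elif u < rate_lost + rate_dup:
            x1 -= 1
        else:
            x1 += 1
    return x0, x1, F
```

For instance, `simulate(20000, 1.0, 1.0, 1.0, 3.0)` corresponds to an overloaded network ($\rho = 1 < 2\beta = 2$), in which a sizeable fraction of the files should be lost by time 3.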

The stochastic differential equations associated to this transient Markov process
can be written as

$$
X_0^N(t) = X_0^N(0) + \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le X_1^N(u-)\}}\, \mathcal{N}_{\mu,i}(du),
\tag{3}
$$

$$
X_1^N(t) = X_1^N(0) - \int_0^t \mathbf{1}_{\{X_1^N(u-) > 0\}}\, \mathcal{N}_{\lambda N}(du)
 - \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le X_1^N(u-)\}}\, \mathcal{N}_{\mu,i}(du)
 + \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le F_N - X_0^N(u-) - X_1^N(u-)\}}\, \mathcal{N}_{2\mu,i}(du),
\tag{4}
$$

where $(\mathcal{N}_{\mu,i})$ and $(\mathcal{N}_{2\mu,i})$ are two independent i.i.d. sequences of Poisson processes
with respective parameters $\mu$ and $2\mu$, and $\mathcal{N}_{\lambda N}$ is an independent Poisson process with
parameter $\lambda N$. For the $i$th file having only one copy, the integrand of the right
hand side of Relation (3) corresponds to its definitive loss and the first term of the
right hand side of Relation (4) is associated to its duplication. The last term of
Relation (4) represents the loss of a copy of files with two copies.
Relation (3) can be rewritten as

$$
X_0^N(t) = X_0^N(0) + \mu \int_0^t X_1^N(u)\, du + M_0^N(t),
\tag{5}
$$

where $(M_0^N(t))$ is the martingale defined by

$$
M_0^N(t) := \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le X_1^N(u-)\}} \left[ \mathcal{N}_{\mu,i}(du) - \mu\, du \right],
$$

its increasing process is given by

$$
\langle M_0^N \rangle (t) = \mu \int_0^t X_1^N(u)\, du,
$$

in particular, since $X_1^N(u) \le F_N$, there exists some constant $C_0$ such that

$$
E\left(M_0^N(t)^2\right) = E\left(\langle M_0^N \rangle (t)\right) \le C_0 N t
$$

holds for all $t \ge 0$ and $N \ge 1$.

Similarly, if $f$ is in $C_c(\mathbb{N})$, the set of functions with finite support on $\mathbb{N}$,
Relation (4) gives the representation

$$
f(X_1^N(t)) = f(X_1^N(0)) + \mu \int_0^t \left[ f(X_1^N(u) - 1) - f(X_1^N(u)) \right] X_1^N(u)\, du
 + N \int_0^t \Omega\!\left[ \frac{F_N}{N} - \frac{X_0^N(u) + X_1^N(u)}{N} \right](f)(X_1^N(u))\, du + M_f^N(t),
\tag{6}
$$

where, for $y \ge 0$, $\Omega[y]$ is the functional operator defined by

$$
\Omega[y](f)(x) = 2\mu y \left( f(x+1) - f(x) \right) + \lambda \mathbf{1}_{\{x > 0\}} \left( f(x-1) - f(x) \right), \quad x \in \mathbb{N},
\tag{7}
$$

and $(M_f^N(t))$ is a martingale such that, for some constant $C_1$,

$$
E\left(M_f^N(t)^2\right) \le C_1 N \|f\|_{\infty, F_N}\, t
$$

holds for all $t \ge 0$ and $N \ge 1$, where $\|f\|_{\infty, F_N} = \max\{|f(x)| : 0 \le x \le F_N\}$.

3. The Overloaded Network 

In this section, it is proved that a significant fraction of files is lost quickly if
the network is not correctly dimensioned, i.e. when the ratio $\rho = \lambda/\mu$ is less than
$2\beta$. In this case, for a large $N$, the fraction of files with two copies at time $t$,
$(F_N - X_0^N(t) - X_1^N(t))/N$, is close to $\rho/2$ if $t$ is large enough. As a consequence
$(\beta - \rho/2)N$ files are lost and the network stabilizes with a subset of files with two
copies whose cardinality is of the order of $\rho N/2$. This is the critical case which is
analyzed in Section 4, where it is proved that the number of lost files is of the
order of $\sqrt{N}$. When $\rho > 2\beta$, no file is lost at the fluid level. This case is investigated
precisely in Section 5.

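Theorem 1 (Fluid Equations). If $(X_0^N(0), X_1^N(0))$ is some fixed element of $S$
and $\lim_{N \to +\infty} F_N/N = \beta$, then the sequence of processes $(X_0^N(t)/N, X_1^N(t)/N)$
converges in distribution to

$$
\begin{cases}
\left( (\beta - \rho/2)\left(1 - 2e^{-\mu t} + e^{-2\mu t}\right),\ (2\beta - \rho)\left(e^{-\mu t} - e^{-2\mu t}\right) \right) & \text{if } \rho < 2\beta,\\
(0, 0) & \text{if } \rho \ge 2\beta.
\end{cases}
$$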
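
The closed-form limit of Theorem 1 can be checked numerically. The sketch below, which is only an illustration, compares it with a projected Euler scheme for the fluid dynamics $\dot{x}_0 = \mu x_1$, $\dot{x}_1 = \mu(2\beta - 2x_0 - 3x_1) - \lambda$, with $x_1$ kept non-negative. The function names `fluid_limit` and `euler_fluid`, the step count and the initial state $(0, 0)$ are assumptions of the example.

```python
import math

def fluid_limit(beta, rho, mu, t):
    """Closed-form fluid limit (x0(t), x1(t)) stated in Theorem 1."""
    if rho >= 2.0 * beta:
        return 0.0, 0.0
    e1, e2 = math.exp(-mu * t), math.exp(-2.0 * mu * t)
    x0 = (beta - rho / 2.0) * (1.0 - 2.0 * e1 + e2)
    x1 = (2.0 * beta - rho) * (e1 - e2)
    return x0, x1

def euler_fluid(beta, rho, mu, t, steps=100000):
    """Projected Euler scheme for the fluid equations, started at (0, 0)."""
    lam = rho * mu
    h = t / steps
    x0, x1 = 0.0, 0.0
    for _ in range(steps):
        drift = mu * (2.0 * beta - 2.0 * x0 - 3.0 * x1) - lam
        new_x1 = max(x1 + h * drift, 0.0)   # reflection at 0
        x0 += h * mu * x1
        x1 = new_x1
    return x0, x1
```

With $\beta = \rho = \mu = 1$ the two functions should agree up to the discretization error of the Euler scheme, and for $\rho \ge 2\beta$ both return $(0, 0)$.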

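Proof. Let $(X_0^N(0), X_1^N(0)) = (y_0, y_1) \in S$. Equations (5) and (6), with the
function $f = \mathrm{Id}$ on $[0, F_N]$, can be written as

$$
\frac{X_0^N(t)}{N} = \frac{y_0}{N} + \mu \int_0^t \frac{X_1^N(u)}{N}\, du + \frac{M_0^N(t)}{N},
\tag{8}
$$

$$
\frac{X_1^N(t)}{N} = \frac{y_1}{N} + \mu \int_0^t \left( 2\frac{F_N}{N} - 2\frac{X_0^N(u)}{N} - 3\frac{X_1^N(u)}{N} \right) du
 - \lambda \int_0^t \mathbf{1}_{\{X_1^N(u) > 0\}}\, du + \frac{M_1^N(t)}{N}.
\tag{9}
$$

Doob's Inequality and the bounds on the second moments of the associated martingales
show that, for $i = 0, 1$ and $t \ge 0$,

$$
P\left( \sup_{0 \le s \le t} \frac{|M_i^N(s)|}{N} \ge \varepsilon \right)
\le \frac{1}{\varepsilon^2 N^2} E\left(M_i^N(t)^2\right) \le \frac{1}{N} \frac{C_i t}{\varepsilon^2}.
$$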
Therefore, the two sequences of processes $(M_0^N(t)/N)$ and $(M_1^N(t)/N)$ converge in
distribution to $0$ uniformly on compact sets.

For $T > 0$, $\delta > 0$ and for $i = 0, 1$, define $w^T_{i,N}(\delta)$ as the modulus of continuity
of the process $(X_i^N(t)/N)$ on the interval $[0, T]$,

$$
w^T_{i,N}(\delta) = \sup_{0 \le s \le t \le T,\ |t - s| \le \delta} \frac{1}{N}\left| X_i^N(t) - X_i^N(s) \right|.
\tag{10}
$$

By using the fact that, for some constant $C$, $X_i^N(t) \le F_N \le CN$ for all $N \in \mathbb{N}$ and
$t \ge 0$, the above equations and the convergence of the martingales to $0$ give that, for
any $\varepsilon > 0$ and $\eta > 0$, there exists $\delta > 0$ such that the relation $P(w^T_{i,N}(\delta) > \eta) \le \varepsilon$
holds for all $N$.

This implies that the sequence of stochastic processes $(X_0^N(t)/N, X_1^N(t)/N)$ is
tight. See Billingsley [3] for example. One denotes by $(x_0(t), x_1(t))$ a limiting value
for some subsequence $(N_k)$. From Equation (5), one gets the relation

$$
x_0(t) = \mu \int_0^t x_1(u)\, du.
$$

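Define

$$
Z^N(t) := \frac{y_1}{N} + \mu \int_0^t \left( 2\frac{F_N}{N} - 2\frac{X_0^N(u)}{N} - 3\frac{X_1^N(u)}{N} \right) du - \lambda t + \frac{M_1^N(t)}{N}.
\tag{11}
$$

Equation (9) can also be interpreted as the fact that

$$
\left( \bar{X}_1^N(t), R^N(t) \right) := \left( \frac{X_1^N(t)}{N},\ \lambda \int_0^t \mathbf{1}_{\{X_1^N(u) = 0\}}\, du \right)
\tag{12}
$$

is the unique solution of the Skorokhod problem associated to the process $(Z^N(t))$.
See Appendix for a definition.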
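
To make the reflection mechanism concrete, here is a minimal discrete-time sketch of the one-dimensional Skorokhod problem. It assumes the classical explicit representation: for a path $z$ with $z(0) \ge 0$, the solution is $x = z + r$ with $r(t) = \sup_{s \le t} \max(-z(s), 0)$. The appendix of the paper is the reference for the definition actually used; the function name `skorokhod_reflect` is hypothetical.

```python
def skorokhod_reflect(z):
    """Discrete one-dimensional Skorokhod map of a sampled path z.

    Returns (x, r) where r is the running maximum of max(-z, 0), i.e. the
    minimal non-decreasing pushing term, and x = z + r stays non-negative.
    """
    x, r = [], []
    running = 0.0
    for value in z:
        running = max(running, -value)   # push only when z would go below 0
        r.append(running)
        x.append(value + running)
    return x, r
```

With this representation, the reflected path satisfies $\sup_{s \le t} |x(s)| \le 2 \sup_{s \le t} |z(s)|$, which is the type of bound used for tightness arguments on reflected processes.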

The sequence $(Z^N(t))$ is converging in distribution and by the continuous mapping
theorem

$$
\lim_{N \to +\infty} (Z^N(t)) = (y(t)) := \left( \mu \int_0^t \left( 2\beta - 2x_0(u) - 3x_1(u) \right) du - \lambda t \right)
 = \left( (2\mu\beta - \lambda)t - \mu \int_0^t \left( 3x_1(u) + 2\mu \int_0^u x_1(v)\, dv \right) du \right).
\tag{13}
$$

The solutions of Skorokhod problems being continuous with respect to the process
$(Z^N(t))$, see Appendix D of Robert [29] for example, one gets that $(\bar{X}_1^N(t), R^N(t))$
converges in distribution to the solution $(x_y(t), r_y(t))$ of the Skorokhod problem
associated to $(y(t))$. Since $x_y(t) = x_1(t)$ and $y(t) = F(x_1)(t)$ with

$$
F(x)(t) := (2\mu\beta - \lambda)t - \mu \int_0^t \left( 3x(u) + 2\mu \int_0^u x(v)\, dv \right) du,
$$

the process $(x_1(t))$ is a solution of the generalized Skorokhod problem (GSP)
associated to the functional $F$. See Appendix. Proposition 3 shows that such a
solution exists and is unique. This implies that there is a unique, deterministic
limiting value for the sequence $(X_0^N(t)/N, X_1^N(t)/N)$. It is easy to check that the
explicit expressions for $(x_0(t))$ and $(x_1(t))$ given in the statement of the theorem
are indeed the solutions of the GSP. The convergence in distribution is therefore
established. □

4. The Critical Case

To complete the picture of the overloaded network $\rho < 2\beta$, one considers the
critical case $\rho = 2\beta$. As will be seen, the convergence result is expressed in
terms of a reflected stochastic differential equation. The appendix presents the
corresponding definition and a result of existence and uniqueness.

Theorem 2. If $\lambda/\mu = 2\beta$ and, for some $\gamma \in \mathbb{R}$,

$$
\lim_{N \to +\infty} \frac{1}{\sqrt{N}} \left( F_N - N\frac{\rho}{2} \right) = \gamma
\quad \text{and} \quad
\lim_{N \to +\infty} \frac{X_1^N(0)}{\sqrt{N}} = y,
$$

and $X_0^N(0) = 0$, then for the convergence in distribution

$$
\lim_{N \to +\infty} \left( \frac{X_0^N(t)}{\sqrt{N}}, \frac{X_1^N(t)}{\sqrt{N}} \right) = \left( \mu \int_0^t Y(u)\, du,\ Y(t) \right),
$$

where $(Y(t))$ is the solution starting at $y$ of the stochastic differential equation

$$
dY(t) = \sqrt{2\lambda}\, dB(t) + \mu \left( 2\gamma - 3Y(t) - 2\mu \int_0^t Y(u)\, du \right) dt
\tag{14}
$$

reflected at 0, i.e. with the constraint that $Y(t) \ge 0$, for all $t \ge 0$. The process
$(B(t))$ is a standard Brownian motion on $\mathbb{R}$.

The solution of SDE (14) is non-Markovian due to the integral term in the drift.
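Proof. Equations (5) and (6), with the function $f = \mathrm{Id}$ on $[0, F_N]$, can be written
as

$$
\frac{X_0^N(t)}{\sqrt{N}} = \mu \int_0^t \frac{X_1^N(u)}{\sqrt{N}}\, du + \frac{M_0^N(t)}{\sqrt{N}},
\tag{15}
$$

$$
\frac{X_1^N(t)}{\sqrt{N}} = \frac{X_1^N(0)}{\sqrt{N}} + \mu \int_0^t \left( 2\gamma_N - 2\frac{X_0^N(u)}{\sqrt{N}} - 3\frac{X_1^N(u)}{\sqrt{N}} \right) du
 + \frac{M_1^N(t)}{\sqrt{N}} + \lambda \sqrt{N} \int_0^t \mathbf{1}_{\{X_1^N(u) = 0\}}\, du,
\tag{16}
$$

with $\gamma_N = (F_N - N\rho/2)/\sqrt{N}$. With the same notations as in Section 2, the
martingales $(M_0^N(t))$ and $(M_1^N(t))$ are

$$
M_0^N(t) = \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le X_1^N(u-)\}} \left[ \mathcal{N}_{\mu,i}(du) - \mu\, du \right],
$$

$$
M_1^N(t) = \sum_{i=1}^{+\infty} \int_0^t \mathbf{1}_{\{i \le F_N - X_0^N(u-) - X_1^N(u-)\}} \left[ \mathcal{N}_{2\mu,i}(du) - 2\mu\, du \right]
 - M_0^N(t) - \int_0^t \mathbf{1}_{\{X_1^N(u-) > 0\}} \left[ \mathcal{N}_{\lambda N}(du) - \lambda N\, du \right].
$$

Their increasing processes are given by

$$
\frac{1}{N} \langle M_0^N \rangle (t) = \mu \int_0^t \frac{X_1^N(u)}{N}\, du,
$$

$$
\frac{1}{N} \langle M_1^N \rangle (t) = 2\mu \int_0^t \frac{F_N - X_0^N(u) - X_1^N(u)}{N}\, du
 + \frac{1}{N} \langle M_0^N \rangle (t) + \lambda \int_0^t \mathbf{1}_{\{X_1^N(u) > 0\}}\, du.
$$

The last term of the right hand side of the above equation can be written as
$\lambda t - R^N(t)$, where $(R^N(t))$ is defined by Equation (12) in the proof of the
previous theorem. The process $(R^N(t))$ is the second component
of the solution to the Skorokhod problem associated to the process $(Z^N(t))$ of
Relation (11). It has been seen that the sequence of processes $(Z^N(t))$ is converging
to $(y(t))$ defined in Equation (13). In this case $(y(t))$ is identically 0, the solution
of the corresponding Skorokhod problem associated to $(y(t))$ is therefore $(0, 0)$.
The continuity properties of the solutions of the Skorokhod problem imply that
the process $(R^N(t))$ converges to 0. Consequently, by Theorem 1 one gets the
convergence in distribution

$$
\lim_{N \to +\infty} \int_0^t \mathbf{1}_{\{X_1^N(u) = 0\}}\, du = 0
$$

and therefore

$$
\lim_{N \to +\infty} \frac{1}{N} \langle M_0^N \rangle (t) = 0
\quad \text{and} \quad
\lim_{N \to +\infty} \frac{1}{N} \langle M_1^N \rangle (t) = 2\lambda t.
$$

One deduces that $(\widetilde{M}_1^N(t)) := (M_1^N(t)/\sqrt{N})$ converges to $(\sqrt{2\lambda} B(t))$, where $(B(t))$
is a standard Brownian motion, and that $(\widetilde{M}_0^N(t)) := (M_0^N(t)/\sqrt{N})$ converges to 0.
See Ethier and Kurtz for example.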
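One now proves that the processes

$$
(\widetilde{X}_0^N(t)) := \left( \frac{X_0^N(t)}{\sqrt{N}} \right)
\quad \text{and} \quad
(\widetilde{X}_1^N(t)) := \left( \frac{X_1^N(t)}{\sqrt{N}} \right)
$$

are tight. If $(h(t))$ is a function on $\mathbb{R}_+$, one denotes

$$
\|h\|_{\infty, t} = \sup_{0 \le s \le t} |h(s)|
$$

and $w_h(\cdot)$ is the modulus of continuity of $h$ defined by Equation (10). Equation (15)
gives, for $0 \le t \le T$,

$$
\|\widetilde{X}_0^N\|_{\infty, t} \le \|\widetilde{M}_0^N\|_{\infty, t} + \mu \int_0^t \|\widetilde{X}_1^N\|_{\infty, u}\, du.
$$

Equation (16) shows that $(X_1^N(t)/\sqrt{N})$ is the first coordinate of the solution of the
Skorokhod problem associated to $(Z_1^N(t))$ defined by

$$
Z_1^N(t) := y_N + \mu \int_0^t \left( 2\gamma_N - 3\frac{X_1^N(u)}{\sqrt{N}} - 2\frac{X_0^N(u)}{\sqrt{N}} \right) du + \frac{M_1^N(t)}{\sqrt{N}},
\tag{17}
$$

with $y_N = X_1^N(0)/\sqrt{N}$. By using the explicit representation of the solution of a
Skorokhod problem in dimension 1, one has

$$
\|\widetilde{X}_1^N\|_{\infty, t} \le 2 \|Z_1^N\|_{\infty, t} \quad \text{for } 0 \le t \le T,
$$

see Appendix D of Robert [29] for example. Then, by bounding $\|Z_1^N\|_{\infty, t}$ with
Equation (17) and the previous inequality for $\|\widetilde{X}_0^N\|_{\infty, t}$,

$$
\|\widetilde{X}_1^N\|_{\infty, t} \le U^N(T) + (4 + \mu T)\mu \int_0^t \|\widetilde{X}_1^N\|_{\infty, u}\, du,
$$

with $U^N(T) := 2y_N + 4\mu\gamma_N T + 2\|\widetilde{M}_1^N\|_{\infty, T} + 4\mu T \|\widetilde{M}_0^N\|_{\infty, T}$.

Gronwall's Inequality gives that the relation $\|\widetilde{X}_1^N\|_{\infty, t} \le U^N(T) \exp((4 + \mu T)\mu t)$
holds for $0 \le t \le T$, and, consequently,

$$
\|\widetilde{X}_0^N\|_{\infty, T} \le \mu T\, U^N(T)\, e^{(4 + \mu T)\mu T} + \|\widetilde{M}_0^N\|_{\infty, T}.
$$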

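The convergence of martingales shows that the two sequences of random variables
$(U^N(T))$ and $(\|\widetilde{M}_0^N\|_{\infty, T})$ converge in distribution. Consequently, for $\varepsilon > 0$, there
exists some $K > 0$ such that for $i = 0, 1$ and all $N > 0$,

$$
P\left( \|\widetilde{X}_i^N\|_{\infty, T} \ge K \right) \le \varepsilon.
$$

If $\eta > 0$, there exist $N_0$ and $\delta$ sufficiently small so that, for all $N \ge N_0$,

$$
2\mu\delta T(\gamma_N + 2K) < \eta/2
\quad \text{and} \quad
P\left( w_{\widetilde{M}_1^N}(\delta) \ge \eta/2 \right) \le \varepsilon,
$$

the last relation coming from the fact that the sequence $(\widetilde{M}_1^N(t))$ is converging in
distribution to a continuous process. One gets finally

$$
P\left( w_{Z_1^N}(\delta) \ge \eta \right)
\le P\left( \|\widetilde{X}_0^N\|_{\infty, T} \ge K \right) + P\left( \|\widetilde{X}_1^N\|_{\infty, T} \ge K \right)
 + P\left( w_{\widetilde{M}_1^N}(\delta) \ge \eta/2 \right) \le 3\varepsilon.
$$

The sequence $(Z_1^N(t))$ is therefore tight; by continuity of the solution of the
Skorokhod problem the same property holds for $(X_1^N(t)/\sqrt{N})$ and consequently for
$(X_0^N(t)/\sqrt{N})$.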

If $(Y_0(t), Y_1(t))$ is a limit of a subsequence $[(X_0^{N_k}(t)/\sqrt{N_k}, X_1^{N_k}(t)/\sqrt{N_k})]$, then, by
Equations (15) and (16), one gets that

$$
\left( \frac{X_1^{N_k}(t)}{\sqrt{N_k}},\ \lambda \sqrt{N_k} \int_0^t \mathbf{1}_{\{X_1^{N_k}(u) = 0\}}\, du \right)
$$

converges in distribution to the solution of the Skorokhod problem associated to
the process

$$
y + \sqrt{2\lambda} B(t) + \mu \int_0^t \left( 2\gamma - 3Y_1(u) - 2\mu \int_0^u Y_1(v)\, dv \right) du.
$$

One concludes that $(Y_1(t))$ is the solution of the generalized Skorokhod problem for
the functional $F$ defined by

$$
F(h)(t) = y + \sqrt{2\lambda} B(t) + \mu \int_0^t \left( 2\gamma - 3h(u) - 2\mu \int_0^u h(v)\, dv \right) du.
$$

Proposition 3 in the appendix shows that there is a unique solution $(Y_1(t))$ and
consequently a unique limit $(Y_0(t), Y_1(t))$. The theorem is proved. □

5. The Time Scales of the Stable Network

The asymptotic properties of the network are investigated under the condition
$\rho = \lambda/\mu > 2\beta$. In Section 3 it has been shown that, in this case, the system
is stable at the fluid level, i.e. that the fraction of lost files is 0. Of course this
does not change the fact that the system is still transient with the absorbing state
$(F_N, 0)$. To have a precise idea on how the system reaches this state, there are
three interesting time scales to consider:

(1) Slow time scale: $t \to t/N$,

(2) Normal time scale: $t \to t$,

(3) Linear time scale: $t \to Nt$,

they are investigated successively in this section. The following elementary lemma
will be used throughout the section.

Lemma 1. If $\rho = \lambda/\mu > 2\beta$, for any $\beta_0 > \beta$ such that $\lambda/\mu > 2\beta_0$, $\varepsilon > 0$, $\eta > 0$
and $T > 0$, there exists $N_0 \in \mathbb{N}$ such that

(1) Coupling: there exists a probability space where the relation

$$ X_1^N(t) \le L_{\beta_0}(Nt), \quad \forall t \ge 0, $$

holds for all $N \ge N_0$ and $t \ge 0$, with $(L_{\beta_0}(t))$ the process of the number of
customers of an ergodic M/M/1 queue with arrival rate $2\mu\beta_0$ and service
rate $\lambda$ and with initial condition $L_{\beta_0}(0) = X_1^N(0)$.

(2) The relation

$$
P\left( \sup_{0 \le s, t \le T,\ |t - s| \le \delta} \frac{1}{N} \int_{Ns}^{Nt} L_{\beta_0}(u)\, du \ge \eta \right) \le \varepsilon
$$

holds.

Proof. There exist some $\beta_0 > \beta$ and $N_0 \ge 1$ such that $\lambda > 2\beta_0\mu$ and $F_N \le N\beta_0$
for $N \ge N_0$. It is enough to take the M/M/1 queue with arrival rate $2\mu\beta_0$ and
service rate $\lambda$.

Denote by $A$ the event on the left hand side of the last relation to prove. If, for
$x \in \mathbb{N}$, $\tau_x$ denotes the hitting time of $x$ by the process $(L_{\beta_0}(t))$, for $\delta < 1/2$, one
has

$$
P(A) \le P\left( \tau_{\lfloor \delta N \rfloor} \le NT \right).
$$

By ergodicity of this process and Proposition 5.11 of Robert [29] for example, there
exists some $0 < \alpha < 1$ such that the sequence $(\alpha^N \tau_{\lfloor \delta N \rfloor})$ converges in distribution.
The last term of the above relation is thus arbitrarily small as $N$ gets large. □

The slow time scale. A description of the asymptotic behavior for the slow
time scale is presented informally. From Relation (1), one can see that the
Q-matrix of the process on the slow time scale $(X_0^N(t/N), X_1^N(t/N))$, which is
$(q^N(\cdot, \cdot)/N)$, has the following asymptotic expansion

$$
\lim_{N \to +\infty} \frac{1}{N}
\begin{pmatrix} q^N(x, x + e_1)\\ q^N(x, x - e_1)\\ q^N(x, x - e_1 + e_0) \end{pmatrix}
=
\begin{pmatrix} 2\mu\beta\\ \lambda \mathbf{1}_{\{x_1 > 0\}}\\ 0 \end{pmatrix}.
$$




With elementary arguments which are skipped, one can easily get the following
proposition. It states that, on the slow time scale, with probability 1 no file is lost
at all in the limit.

Proposition 1. The sequence of processes $(X_0^N(t/N), X_1^N(t/N))$ converges in
distribution to the process $(0, L_\beta(t))$, where $(L_\beta(t))$ is the process of the number of jobs
of an M/M/1 queue with arrival rate $2\mu\beta$ and service rate $\lambda$.

The normal time scale. It is shown that, on the normal time scale, the stability
does not only hold on the fluid level: almost surely there is a finite number of losses
in any finite time interval, more precisely losses occur as a Poisson process. See
Theorem 3. The capacity $\lambda N$ of the network is thus able to maintain an almost
complete set of files. The following theorem shows in particular that the number
of definitive losses at time $t \ge 0$ is finite with a Poisson distribution.

Theorem 3. If $\rho = \lambda/\mu > 2\beta$,

(a) the sequence of processes $(X_0^N(t))$ converges in distribution to a Poisson
point process on $\mathbb{R}_+$ with rate $2\mu\beta/(\rho - 2\beta)$;

(b) for $t > 0$, as $N$ goes to infinity, the random variable $X_1^N(t)$ converges in
distribution to a geometric distribution with parameter $2\beta/\rho$.

The second convergence is for the marginal distribution of $(X_1^N(s))$ at time
$t$. One cannot expect a convergence in distribution of the sequence of processes
$(X_1^N(t))$. Indeed, since the sequence of processes $(X_1^N(t/N))$ is converging in
distribution to the law of the M/M/1 process $(L_\beta(t))$, for $0 \le s < t$, the distributions
of $(X_1^N(s), X_1^N(t))$ and of $(L_\beta(Ns), L_\beta(Nt))$ are close. Between time $Ns$ and $Nt$,
the M/M/1 queue "forgets" its location at time $Ns$ (just because it hits $0$ with
probability 1) so that when $N$ goes to infinity the couple $(X_1^N(s), X_1^N(t))$ converges in
distribution to a pair of independent geometric random variables. The
sample paths of a possible limit of $(X_1^N(s))$ would not have regularity properties.
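
The two statements of Theorem 3 are linked by a simple computation: lost files appear at rate $\mu X_1^N(t)$, so the Poisson rate should be $\mu$ times the mean of the limiting geometric distribution. The sketch below checks this arithmetic. It assumes that "geometric with parameter $a$" means $P(k) = (1 - a)a^k$ for $k \ge 0$, whose mean is $a/(1 - a)$; the function names `mm1_mean` and `poisson_loss_rate` are hypothetical.

```python
def mm1_mean(load):
    """Mean of the geometric law P(k) = (1 - load) * load**k, k >= 0."""
    if not 0.0 <= load < 1.0:
        raise ValueError("load must be in [0, 1)")
    return load / (1.0 - load)

def poisson_loss_rate(beta, lam, mu):
    """Limiting loss rate 2*mu*beta / (rho - 2*beta) in the stable case."""
    rho = lam / mu
    if rho <= 2.0 * beta:
        raise ValueError("stable case requires rho > 2 * beta")
    return 2.0 * mu * beta / (rho - 2.0 * beta)
```

For example, with $\beta = 1$, $\lambda = 4$ and $\mu = 1$, the load is $2\beta/\rho = 1/2$ and both `mu * mm1_mean(0.5)` and `poisson_loss_rate(1.0, 4.0, 1.0)` give the same value.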



Proof. Define

$$
\eta_N(t) := \int_0^t X_1^N(u)\, du;
$$

for $0 \le s \le t$, the above lemma gives that

$$
\eta_N(t) - \eta_N(s) = \int_s^t X_1^N(u)\, du \le \int_s^t L_{\beta_0}(Nu)\, du = \frac{1}{N} \int_{Ns}^{Nt} L_{\beta_0}(u)\, du.
$$

The criterion of the modulus of continuity and Lemma 1 give that the sequence of
processes $(\eta_N(t))$ is tight. The above inequality and the ergodic theorem applied
to the ergodic Markov process $(L_{\beta_0}(t))$ show also that, almost surely,

$$
\limsup_{N \to +\infty} \int_0^t X_1^N(u)\, du \le \frac{2\beta_0}{\rho - 2\beta_0}\, t.
\tag{18}
$$

For $T > 0$ fixed and $K > 0$,

$$
P\left( X_0^N(T) \ge K \right) \le P\left( \mu \int_0^T X_1^N(u)\, du \ge K/2 \right) + P\left( M_0^N(T) \ge K/2 \right)
\le P\left( \mu \int_0^T X_1^N(u)\, du \ge K/2 \right) + \frac{4}{K^2} E\left( \mu \int_0^T X_1^N(u)\, du \right).
$$

One can thus choose $K$ so that $P(X_0^N(T) \ge K) \le \varepsilon$ holds for $N \ge N_0$ for some
$N_0 \in \mathbb{N}$. As in the proof of Lemma 1, for $\delta > 0$, there exists some $N_1 \in \mathbb{N}$ such
that if $N \ge N_1$ then

$$
P\left( \sup_{0 \le s \le NT} L_{\beta_0}(s) \ge \delta N \right) \le \varepsilon.
$$

In the same way as in the proof of the above lemma, one can construct an M/M/1
process $(Z^N(t))$ whose arrival and service rates are respectively

$$
2\mu \left( \frac{F_N - K}{N} - \delta \right) \quad \text{and} \quad \lambda,
$$

and such that, on the event

$$
A_T = \left\{ X_0^N(T) \le K,\ \sup_{0 \le t \le NT} L_{\beta_0}(t) \le \delta N \right\},
$$

the relation $X_1^N(t) \ge Z^N(Nt)$ holds for all $t \le T$. Hence, almost surely,

$$
\liminf_{N \to +\infty} \eta_N(t) \ge \liminf_{N \to +\infty} \frac{1}{N} \int_0^{Nt} Z^N(u)\, du = \frac{2(\beta - \delta)}{\rho - 2(\beta - \delta)}\, t
\tag{19}
$$

holds on $A_T$. By letting $\delta$ go to $0$ and $\beta_0$ to $\beta$ in Equations (19) and (18)
respectively, one gets that the variable $\mu\eta_N(t)$ converges almost surely to $\alpha t$ with
$\alpha = 2\beta\mu/(\rho - 2\beta)$. Consequently, the tightness of the sequence of processes $(\eta_N(t))$
implies that $(\mu\eta_N(t))$ is converging in distribution to $(\alpha t)$.

Note that $t \mapsto X_0^N(t)$ can also be seen as a point process with jumps of size 1.
By Equation (3), one has that

$$
X_0^N(t) - \mu \int_0^t X_1^N(u)\, du
$$

is a martingale with respect to the natural filtration of the associated Poisson
processes. The random measure

$$
\Lambda^N([0, t]) := \mu \int_0^t X_1^N(u)\, du
$$

is a compensator of the point process $t \mapsto X_0^N(t)$. See Kasahara and Watanabe [19].
It has therefore been shown that the sequence of compensators is converging to the
deterministic measure $\alpha\, dx$. Theorem 5.1 of [19], see also Brown [4], gives the
convergence in distribution of $(X_0^N(t))$ to a Poisson process with rate $\alpha$.

In a similar way as before, through the convergence of the Q-matrix, the
asymptotic distribution of $X_1^N(t)$ can be easily obtained by conditioning on the event
$\{X_0^N(t) \le K\}$ for $K$ large and by using arbitrarily close M/M/1 processes at
equilibrium as upper and lower stochastic bounds for $X_1^N(t)$. Details are skipped. □

The linear time scale t — > Nt. On the linear time scale, it will be shown that a 
fraction '^{t) of the files is lost at time t. In some way the linear time scale gives a 
picture of the decay of the network. 

For iV > 1, the random measure jin on N x M_|_ is defined as, for a measurable 
function g : N x R+ ^ R+, 

W,g)= / g{X^{Nt),t)dt. 
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Note that ii g{x,t) = h{x)l^[Q^T]}{t) for T > 0, then 

Consequently $(\mu_N)$ is a relatively compact sequence of random Radon measures on
$\mathbb{N}\times\mathbb{R}_+$; see Dawson [7] for example. Note that the identically null measure can
be a possible limit of this sequence.

From now on, one fixes $(N_k)$ such that $(\mu_{N_k})$ is a converging subsequence whose
limit is $\nu$. By taking a convenient probability space, one can assume that the
convergence of $(\mu_{N_k})$ holds almost surely for the weak convergence of Radon measures.

Since, for $N \ge 1$, $\mu_N$ is absolutely continuous with respect to the product of the
counting measure on $\mathbb{N}$ and Lebesgue measure on $\mathbb{R}_+$, the same property holds for
the limiting measure $\nu$. Let $(x,t) \mapsto \pi_t(x)$ denote its (random) density. It should
be remarked that one can choose a version of $\pi_t(x)$ such that the map $(\omega, x, t) \mapsto \pi_t(x)(\omega)$
on the product of the probability space and $\mathbb{N}\times\mathbb{R}_+$ is measurable by
taking $\pi_t(x)$ as a limit of measurable maps,

$$\pi_t(x) = \limsup_{s\to 0}\frac{1}{s}\,\nu(\{x\}\times[t,t+s]).$$

See Chapter 8 of Rudin [31] for example.

Proposition 2. For the convergence in distribution of continuous processes,

$$\lim_{k\to+\infty}\left(\frac{\mu}{N_k}\int_0^{N_k t} X_1^{N_k}(u)\,du\right) = (\Psi(t)) \stackrel{\text{def.}}{=} \left(\mu\int_0^t \langle\pi_u, I\rangle\,du\right),$$

where $I(x) = x$ for $x \in \mathbb{N}$. Moreover, almost surely, for all $t \ge 0$,

$$\int_0^t \pi_u(\mathbb{N})\,du = t.$$

It must be noted that the last relation is crucial: it shows that the masses of
the measures $\mu_{N_k}$, for $k \ge 1$, do not vanish at infinity. This property is sometimes
absent from the proofs of stochastic averaging principles; it is nevertheless mandatory
to identify $\pi_u$ as an invariant distribution of a Markov process.

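Proof. The criterion of the modulus of continuity is used to prove the tightness of

$$(\Psi_N(t)) \stackrel{\text{def.}}{=} \left(\frac{\mu}{N}\int_0^{Nt} X_1^N(u)\,du\right).$$

By Lemma 1, for $s \le t$,

$$\Psi_N(t) - \Psi_N(s) = \frac{\mu}{N}\int_{Ns}^{Nt} X_1^N(u)\,du \le \frac{\mu}{N^2}\int_{N^2 s}^{N^2 t} L_{\beta_0}(u)\,du.$$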

As in the proof of Proposition [S] one concludes that the sequence of processes 
$(\Psi_N(t))$ is tight.

For $K > 0$ and $t \ge 0$, the almost sure convergence of the measures $(\mu_{N_k})$ gives
the convergence

$$\lim_{k\to+\infty}\frac{1}{N_k}\int_0^{N_k t} X_1^{N_k}(u)\,1_{[0,K]}(X_1^{N_k}(u))\,du = \int_0^t \langle\pi_u, I\,1_{[0,K]}\rangle\,du,$$

where $I(x) = x$. By using again Lemma 1 one gets that

$$\frac{1}{N_k}\int_0^{N_k t} X_1^{N_k}(u)\,1_{\{X_1^{N_k}(u) > K\}}\,du \le \frac{1}{N_k^2}\int_0^{N_k^2 t} L_{\beta_0}(u)\,1_{\{L_{\beta_0}(u) > K\}}\,du$$

and the ergodic theorem applied to $(L_{\beta_0}(t))$ shows that the last quantity converges
in distribution to

$$t\,E\left(L_{\beta_0}(\infty)\,1_{\{L_{\beta_0}(\infty) > K\}}\right),$$

where $L_{\beta_0}(\infty)$ is the limit in distribution of $(L_{\beta_0}(t))$, a geometrically distributed
random variable. For $\varepsilon > 0$, $K$ is chosen sufficiently large so that the last quantity
is less than $\varepsilon/2$; consequently, if $k$ is large enough, one has



1 fNkt 

One deduces that $(\Psi(t))$ is the only possible limiting process for $(\Psi_{N_k}(t))$. This
proves the first half of the proposition.

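For $K \ge 1$, the convergence of $(\mu_{N_k})$ gives the relation

$$\lim_{k\to+\infty}\frac{1}{N_k}\int_0^{N_k t} 1_{\{X_1^{N_k}(u)\le K\}}\,du = \nu([0,K]\times[0,t]) = \int_0^t \pi_u([0,K])\,du.$$

By using again the stochastic domination by an ergodic M/M/1 queue,

$$\frac{1}{N_k}\int_0^{N_k t} 1_{\{X_1^{N_k}(u)\le K\}}\,du \ge \frac{1}{N_k}\int_0^{N_k t} 1_{\{L_{\beta_0}(N_k u)\le K\}}\,du,$$

by letting $k$ go to infinity one gets that, almost surely,

$$t\,P(L_{\beta_0}(\infty)\le K) \le \int_0^t \pi_u([0,K])\,du \le \int_0^t \pi_u(\mathbb{N})\,du.$$

If now $K$ goes to infinity, the left-hand side converges to $t$. Since the mass of $\nu$ on $\mathbb{N}\times[0,t]$ is at most $t$, one obtains that the relation

$$\int_0^t \pi_s(\mathbb{N})\,ds = t$$

holds for all $t \in \mathbb{N}$ and consequently for all $t \ge 0$. The proposition is proved. □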

Theorem 4 (Rate of Decay of the Network). If $\rho = \lambda/\mu > 2\beta$, then, as $N$ goes
to infinity, the process $(X_0^N(Nt)/N)$ converges to $(\Psi(t))$ where $\Psi(t)$ is the unique
solution $y \in [0,\beta]$ of the equation

(20) $(1 - y/\beta)^{\rho/2}\,e^{y+\mu t} = 1.$

For $t > 0$, the process $(X_1^N(Nt+u), u \ge 0)$ converges in distribution to the
stationary process of the number of jobs of an M/M/1 queue with service rate $\lambda$ and
arrival rate $2\mu(\beta - \Psi(t))$.

It is easily seen that the asymptotic expansion $\Psi(t) \sim \beta - \beta\exp(-2(\beta+\mu t)/\rho)$
holds as $t$ goes to infinity. The last part of the theorem states that, "around" time
$Nt$, the process $X_1^N$ has a local equilibrium.
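Equation (20) defines $\Psi(t)$ only implicitly, so a numerical illustration may help. The following is a minimal sketch, not part of the original paper: it solves the logarithm of Equation (20) by bisection after the change of variable $z = 1 - y/\beta$, and compares the result with the large-time expansion stated above. The function names `psi` and `psi_large_time` and the parameter values in the usage note are illustrative assumptions.

```python
import math

def psi(t, beta, rho, mu, iterations=200):
    """Solve (1 - y/beta)**(rho/2) * exp(y + mu*t) = 1 for y in [0, beta)."""
    if rho <= 2.0 * beta:
        raise ValueError("Theorem 4 assumes rho > 2 * beta")

    # With z = 1 - y/beta, the logarithm of Equation (20) reads g(z) = 0.
    # g is increasing on (0, 1] because rho/(2z) - beta > 0 when rho > 2*beta.
    def g(z):
        return 0.5 * rho * math.log(z) + beta * (1.0 - z) + mu * t

    lo, hi = 0.0, 1.0  # g(0+) = -infinity and g(1) = mu*t >= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return beta * (1.0 - hi)

def psi_large_time(t, beta, rho, mu):
    """Asymptotic expansion beta - beta*exp(-2(beta + mu*t)/rho)."""
    return beta - beta * math.exp(-2.0 * (beta + mu * t) / rho)
```

For instance, with the arbitrary values `beta = 1.0`, `rho = 3.0`, `mu = 1.0`, the call `psi(t, 1.0, 3.0, 1.0)` increases from 0 at `t = 0` towards `beta`, and gets close to `psi_large_time(t, 1.0, 3.0, 1.0)` as `t` grows.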
Proof. Equation (14) gives that, for $f \in C_c(\mathbb{N})$,

(21) $f(X_1^N(Nt)) - f(X_1^N(0)) - M_f^N(Nt) = N^2\int_0^t \Omega[Y_N(u)](f)(X_1^N(Nu))\,du + \mu N\int_0^t \Delta^-(f)(X_1^N(Nu))\,X_1^N(Nu)\,du$

with, from Equation (5),

$$Y_N(t) = \frac{F_N}{N} - \frac{X_1^N(Nt)}{N} - \frac{M_0^N(Nt)}{N} - \frac{\mu}{N}\int_0^{Nt} X_1^N(u)\,du,$$

and $\Delta^-(f)(x) = (f(x-1) - f(x))\,1_{\{x\ge 1\}}$. The bound on the increasing process of
the martingale $(M_f^N(t))$ at the end of Section 5, Doob's Inequality and Lemma 1
show that the sequence of processes

$$\frac{1}{N^2}\left(f(X_1^N(Nt)) - f(X_1^N(0)) - M_f^N(Nt) - \mu N\int_0^t \Delta^-(f)(X_1^N(Nu))\,X_1^N(Nu)\,du\right)$$

converges to 0 for the topology of the uniform norm on compact sets.
By using Lemma 1 one gets that

$$\frac{X_1^N(Nu)}{N} \le \frac{L_{\beta_0}(N^2 u)}{N},$$

hence the sequence of processes $(X_1^N(Nu)/N)$ converges in distribution to 0.

The bound on the increasing process and Proposition 2 show that the sequence
of processes $(Y_{N_k}(t))$ converges in distribution to $(\beta - \Psi(t))$. One deduces from
Equation (21) that the sequence of processes

$$\left(\int_0^t \Omega[\beta-\Psi(u)](f)(X_1^{N_k}(N_k u))\,du\right) = \left(\int_{\mathbb{N}\times[0,t]} \Omega[\beta-\Psi(u)](f)(x)\,\mu_{N_k}(dx,du)\right)$$

converges to 0.

The convergence of the $(\mu_{N_k})$ and Proposition 2 give therefore that, almost
surely, the relations

$$\int_0^t \langle\pi_u, \Omega[\beta-\Psi(u)](f)\rangle\,du = 0 \quad\text{and}\quad \int_0^t \pi_u(\mathbb{N})\,du = t$$

hold for all $t \ge 0$ and all functions $f \in C_c(\mathbb{N})$. Note that one has used the fact that
$C_c(\mathbb{N})$ has a countable dense subset for the uniform norm.

If $A$ is the subset of all real numbers $u \ge 0$ such that one of the relations

$$\pi_u(\mathbb{N}) \ne 1,$$

$$\langle\pi_u, \Omega[\beta-\Psi(u)](f)\rangle \ne 0, \text{ for some } f \in C_c(\mathbb{N}),$$

holds, then the Lebesgue measure of $A$ is 0. Hence if $u \notin A$, then $\pi_u(\mathbb{N}) = 1$ and
$\langle\pi_u, \Omega[\beta-\Psi(u)](f)\rangle = 0$ for all $f \in C_c(\mathbb{N})$. Since $\Omega[\beta-\Psi(u)]$ is the infinitesimal
generator of an M/M/1 queue with arrival rate $2\mu(\beta-\Psi(u))$ and service rate $\lambda$,
one gets that $\pi_u$ is a geometric distribution on $\mathbb{N}$ with parameter $2\mu(\beta-\Psi(u))/\lambda$.
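The identification of the invariant distribution can be checked numerically. The sketch below is an illustration only: it assumes the standard form of the generator of an M/M/1 queue (births at rate `arrival`, deaths at rate `service` when the state is positive) and verifies that the geometric distribution with parameter `arrival / service` integrates the generator to zero for a test function with finite support. All function names are hypothetical.

```python
def mm1_generator(f, x, arrival, service):
    """Standard M/M/1 generator applied to f at state x (assumed form)."""
    up = arrival * (f(x + 1) - f(x))
    down = service * (f(x - 1) - f(x)) if x > 0 else 0.0
    return up + down

def geometric_pmf(x, r):
    """Geometric distribution on the non-negative integers with parameter r < 1."""
    return (1.0 - r) * r ** x

def stationarity_defect(f, arrival, service, cutoff=2000):
    """Truncated sum of pi(x) * (generator f)(x); it should vanish for invariant pi."""
    r = arrival / service
    return sum(geometric_pmf(x, r) * mm1_generator(f, x, arrival, service)
               for x in range(cutoff))
```

With `arrival` standing for $2\mu(\beta-\Psi(u))$ and `service` for $\lambda$, the defect is zero up to rounding, and the mean of the geometric distribution is $r/(1-r)$ with $r$ = `arrival / service`.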
From Proposition 2 one gets that, for $t \ge 0$,

$$\Psi(t) = \mu\int_{[0,t]\setminus A}\langle\pi_u, I\rangle\,du = \mu\int_0^t \frac{2\mu(\beta-\Psi(u))}{\lambda-2\mu(\beta-\Psi(u))}\,du,$$

straightforward calculus gives the relation

$$(\beta-\Psi(u))^{\rho/2}\,e^{\Psi(u)} = \beta^{\rho/2}\,e^{-\mu u}.$$

It is easily checked that, since $2\beta < \rho$, there is a unique $\Psi(u) < \beta$ satisfying the
above equation. The theorem is proved. □
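The "straightforward calculus" at the end of the proof can be spelled out as follows. This is a sketch rather than the authors' own computation: it uses the standard fact that a geometric distribution on $\mathbb{N}$ with parameter $r$ has mean $r/(1-r)$, and it assumes that $\Psi$ is differentiable with $\Psi(0) = 0$.

```latex
% Mean of \pi_u with r = 2\mu(\beta-\Psi(u))/\lambda and \rho = \lambda/\mu:
\Psi'(u) = \mu\,\frac{r}{1-r}
         = \frac{2\mu(\beta-\Psi(u))}{\rho-2(\beta-\Psi(u))},
\qquad \Psi(0)=0.
% Separate the variables:
\left(\frac{\rho}{2(\beta-\Psi(u))}-1\right)\Psi'(u) = \mu .
% Integrate from 0 to u:
-\frac{\rho}{2}\log\frac{\beta-\Psi(u)}{\beta}-\Psi(u) = \mu u .
% Exponentiate:
(\beta-\Psi(u))^{\rho/2}\,e^{\Psi(u)} = \beta^{\rho/2}\,e^{-\mu u}
\iff \left(1-\Psi(u)/\beta\right)^{\rho/2} e^{\Psi(u)+\mu u} = 1 .
```

The last line is Equation (20) with $y = \Psi(u)$ and $t = u$.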

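The theorem gives directly the following corollary on the asymptotic behavior of
$T_N(\delta)$, the first time when a fraction $\delta$ of the files has been lost.

Corollary 1. If $\rho = \lambda/\mu > 2\beta$ and if, for $N \ge 1$ and $\delta \in (0,1)$,

$$T_N(\delta) = \inf\{t \ge 0 : X_0^N(t) \ge \delta F_N\},$$

then, for the convergence in distribution,

$$\lim_{N\to+\infty}\frac{T_N(\delta)}{N} = \frac{1}{\mu}\left(-\frac{\rho}{2}\log(1-\delta) - \delta\beta\right).$$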
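The limit in the corollary can be read off Equation (20): assuming, as the scaling of the paper suggests, that $F_N/N$ converges to $\beta$, a fraction $\delta$ of the files is lost when $\Psi(t) = \delta\beta$, and Equation (20) can then be solved explicitly for $t$. The following sketch evaluates this time; the function name `decay_time` is an illustrative choice.

```python
import math

def decay_time(delta, beta, rho, mu):
    """Time t such that y = delta * beta solves Equation (20)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if rho <= 2.0 * beta:
        raise ValueError("the stable regime requires rho > 2 * beta")
    # (1 - delta)**(rho/2) * exp(delta*beta + mu*t) = 1, solved for t.
    return -(0.5 * rho * math.log1p(-delta) + delta * beta) / mu
```

The returned value is positive because $\rho/2 > \beta$ and $-\log(1-\delta) > \delta$, and it grows without bound as `delta` approaches 1.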

Appendix. Generalized Skorokhod Problems 

For the sake of self-containedness, this section quickly presents the more or less
classical material necessary to state and prove the convergence results used in this
paper. The general theme concerns the rigorous definition of a solution of a stochastic
differential equation constrained to stay in some domain and also the proof
of the existence and uniqueness of such a solution. See Skorokhod [32], Anderson
and Orey [1], Chaleyat-Maurel and El Karoui [8] and, in a multi-dimensional context,
Harrison and Reiman [17] and Taylor and Williams [33] and, in a more general
context, Ramanan [27]. See Appendix D of Robert [29] for a brief account.

We first recall the classical definition of the Skorokhod problem in dimension K. If
$(Z(t))$ is some function of the set $\mathcal{D}(\mathbb{R}_+,\mathbb{R})$ of càdlàg functions defined on $\mathbb{R}_+$, the
couple of functions $[(X(t)), (R(t))]$ is said to be a solution of the Skorokhod problem
associated to $(Z(t))$ and $P$ whenever

(1) $X(t) = Z(t) + R(t)$, for all $t \ge 0$,

(2) $X(t) \ge 0$, for all $t \ge 0$,

(3) $t \mapsto R(t)$ is non-decreasing, $R(0) = 0$ and

$$\int_{\mathbb{R}_+} X(t)\,dR(t) = 0.$$
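As an illustration of conditions (1) to (3), the sketch below computes the solution for a path sampled on a grid. It assumes the one-dimensional case with no matrix $P$ and a starting value $Z(0) \ge 0$, and uses the classical explicit formula $R(t) = \sup_{s \le t} \max(0, -Z(s))$. The function name `skorokhod_reflection` is an illustrative choice.

```python
from itertools import accumulate

def skorokhod_reflection(z):
    """Return (x, r) with x = z + r >= 0 for a sampled path z with z[0] >= 0."""
    # r[k] is the running maximum of max(0, -z[j]) for j <= k: the smallest
    # non-decreasing correction that keeps the path non-negative.
    r = list(accumulate((max(0.0, -v) for v in z), max))
    x = [v + p for v, p in zip(z, r)]
    return x, r
```

For the sampled path `[0.0, 1.0, -0.5, -2.0, -1.0, 0.5]`, the correction `r` increases only at the two indices where `x` is equal to 0, which is the discrete counterpart of the condition $\int X(t)\,dR(t) = 0$.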



The generalization used in this paper corresponds to the case when $(Z(t))$ is itself
a functional of $(X(t))$.

Definition 1 (Generalized Skorokhod Problem).

If $G : \mathcal{D}(\mathbb{R}_+,\mathbb{R}) \to \mathcal{D}(\mathbb{R}_+,\mathbb{R})$ is a Borelian function, $((X(t)), (R(t)))$ is a solution
of the generalized Skorokhod Problem (GSP) associated to $G$ if $((X(t)), (R(t)))$ is
the solution of the Skorokhod Problem associated to $G(X)$ and $P$, in particular, for
all $t \ge 0$,

$$X(t) = G(X)(t) + R(t) \quad\text{and}\quad \int_{\mathbb{R}_+} X(t)\,dR(t) = 0.$$
The classical Skorokhod problem described above corresponds to the case when
the functional $G$ is constant and equal to $(Z(t))$. If one takes

$$G(x)(t) = \int_0^t \sigma(x(u))\,dB(u) + \int_0^t \delta(x(u))\,du,$$

where $(B(t))$ is a standard Brownian motion and $\sigma$ and $\delta$ are Lipschitz functions
on $\mathbb{R}$, then the first coordinate $(X(t))$ of a possible solution to the corresponding GSP
can be described as the solution of the SDE

$$dX(t) = \sigma(X(t))\,dB(t) + \delta(X(t))\,dt$$

reflected at 0.

Proposition 3. If $G : \mathcal{D}(\mathbb{R}_+,\mathbb{R}) \to \mathcal{D}(\mathbb{R}_+,\mathbb{R})$ is such that, for any $T > 0$, there
exists a constant $C_T$ such that, for all $(x(t))$, $(y(t)) \in \mathcal{D}(\mathbb{R}_+,\mathbb{R})$ and $0 \le t \le T$,

(22) $\sup_{0\le s\le t}\|G(x)(s) - G(y)(s)\| \le C_T\int_0^t \|x(u) - y(u)\|\,du,$

then there exists a unique solution to the generalized Skorokhod problem associated
to the functional $G$ and the matrix $P$.

Proof. Define the sequence $(X^N(t))$ by induction: $(X^0(t), R^0(t)) = 0$ and, for $N \ge 0$,
$(X^{N+1}, R^{N+1})$ is the solution of the Skorokhod problem (SP) associated to $G(X^N)$,
in particular,

$$X^{N+1}(t) = G(X^N)(t) + R^{N+1}(t) \quad\text{and}\quad \int_{\mathbb{R}_+} X^{N+1}(u)\,dR^{N+1}(u) = 0.$$

The existence of such a solution is guaranteed, as well as the Lipschitz property of
the solutions of a classical Skorokhod problem, see Proposition D.4 of Robert [29].
This gives the existence of some constant $K_T$ such that, for all $N \ge 1$ and $0 \le t \le T$,

$$\|X^{N+1} - X^N\|_{\infty,t} \le K_T\,\|G(X^N) - G(X^{N-1})\|_{\infty,t},$$

where $\|h\|_{\infty,T} = \sup\{|h(s)| : 0 \le s \le T\}$. From Relation (22), this implies that

$$\|X^{N+1} - X^N\|_{\infty,t} \le \alpha\int_0^t \|X^N - X^{N-1}\|_{\infty,u}\,du,$$

with $\alpha = K_T C_T$. The iteration of the last relation yields the inequality

$$\|X^{N+1} - X^N\|_{\infty,t} \le \frac{(\alpha t)^N}{N!}\int_0^t \|X^1\|_{\infty,u}\,du, \quad 0 \le t \le T.$$

One concludes that the sequence $(X^N(t))$ converges uniformly on compact sets
and consequently the same is true for the sequence $(R^N(t))$. Let $(X(t))$ and $(R(t))$
be the limits of these sequences. By continuity of the SP, the couple $((X(t)), (R(t)))$
is the solution of the SP associated to $G(X)$, and hence a solution of the GSP
associated to $G$.

Uniqueness. If $(Y(t))$ is another solution of the GSP associated to $G$ then, in the
same way as before, one gets by induction, for $0 \le t \le T$,

$$\|X - Y\|_{\infty,t} \le \frac{(\alpha t)^N}{N!}\int_0^t \|X - Y\|_{\infty,u}\,du,$$

and by letting $N$ go to infinity, one concludes that $X = Y$. The proposition is
proved. □
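The existence argument is constructive, and the iteration it describes can be run on a grid. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the one-dimensional case without the matrix $P$, a hypothetical functional $G(x)(t) = z(t) + \int_0^t b(x(u))\,du$ with $b$ Lipschitz (which satisfies Relation (22)), and left-endpoint Riemann sums with step `dt`. The names `reflect`, `picard_gsp` and `drift` are illustrative.

```python
from itertools import accumulate

def reflect(z):
    """One-dimensional Skorokhod reflection of a sampled path z."""
    r = list(accumulate((max(0.0, -v) for v in z), max))
    return [v + p for v, p in zip(z, r)], r

def picard_gsp(z, drift, dt, n_iter=30):
    """Iterate X^{n+1} = reflection of G(X^n), starting from X^0 = 0.

    G(x)(t) = z(t) + integral_0^t drift(x(u)) du is a hypothetical example.
    Returns the last iterate, its correction term, and the sup-norm gaps
    between successive iterates.
    """
    x = [0.0] * len(z)
    r = [0.0] * len(z)
    gaps = []
    for _ in range(n_iter):
        # Left-endpoint Riemann sums of the drift along the current iterate.
        integral = [0.0] + list(accumulate(drift(v) * dt for v in x[:-1]))
        g = [zv + iv for zv, iv in zip(z, integral)]
        new_x, r = reflect(g)
        gaps.append(max(abs(a - b) for a, b in zip(new_x, x)))
        x = new_x
    return x, r, gaps
```

On a bounded time interval the recorded gaps decay very quickly, in line with the factorial bound $(\alpha t)^N/N!$ obtained in the proof.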